

 clip score


Appendix

Neural Information Processing Systems

We provide concrete rules below for the two competition tracks that comprise DATACOMP: filtering and BYOD. Additionally, we provide a checklist that encourages participants to specify design decisions, allowing more granular comparison between submissions.

A.1 Filtering track rules

Participants can enter submissions for one or more scales: small, medium, large, or xlarge, which correspond to the raw number of image-text pairs in COMMONPOOL that should be filtered. After choosing a scale, participants generate a list of uids, where each uid refers to a COMMONPOOL sample. The list of uids is used to recover image-text pairs from the pool, which are then used for downstream CLIP training.
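The uid-based selection step can be illustrated with a minimal sketch. It is not the DATACOMP tooling: the in-memory `pool` dictionary, the `select_by_uid` helper, and the example uids are all hypothetical, and silently skipping uids that are missing from the pool is an assumption made here for simplicity.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical in-memory stand-in for a pool: uid -> (image URL, caption).
Sample = Tuple[str, str]


def select_by_uid(pool: Dict[str, Sample], uids: Iterable[str]) -> List[Sample]:
    """Return the image-text pairs named by `uids`, in the order given.

    Assumption: uids that do not appear in the pool are skipped.
    """
    return [pool[uid] for uid in uids if uid in pool]


# Toy pool with made-up uids and captions.
pool = {
    "uid-a": ("http://example.com/a.jpg", "a dog on a beach"),
    "uid-b": ("http://example.com/b.jpg", "IMG_0042"),
    "uid-c": ("http://example.com/c.jpg", "a red bicycle"),
}

# A submission is just the list of uids that survive filtering.
submission = ["uid-c", "uid-a", "uid-missing"]
subset = select_by_uid(pool, submission)
```

Here `subset` holds the pairs for `uid-c` and `uid-a`; in the setting described above, the recovered pairs would then be passed to CLIP training.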




In search of the next generation of multimodal datasets

Neural Information Processing Systems

While these advances use different algorithmic techniques, e.g., contrastive learning, diffusion, or auto-regressive modeling, they all rest on a common foundation: large datasets containing paired image-text examples.


BitsFusion: 1.99 bits Weight Quantization of Diffusion Model Yang Sui, Yanyu Li

Neural Information Processing Systems

Diffusion-based image generation models have achieved great success in recent years by showing the capability of synthesizing high-quality content. However, these models contain a huge number of parameters, resulting in a significantly large model size. Saving and transferring them is a major bottleneck for various applications, especially those running on resource-constrained devices.